Biomedical Literature Mining for Pharmacokinetics Numerical Parameter Collection

Authors

  • Zhiping Wang
  • Luis Rocha
Abstract

Model-based drug studies have been developing rapidly in recent years. They require high-quality numerical data for pharmacokinetics (PK) parameters. However, most parameter measurements are still buried in the scientific literature, and traditional manual data extraction is too expensive to keep up with the exponentially growing number of publications. This thesis focuses on the application of text mining (TM) and machine learning (ML) to the collection of drug PK parameter data from the published literature. First, we explore the feasibility of TM for extracting drug PK parameter data from PubMed abstracts. Our method achieves higher precision and obtains rich information content. For the test drug midazolam, it extracts 10 times more PK clearance data than the manually constructed commercial Drug Interaction Database (DiDB). Similar performance is obtained on additional test drugs. Following the success of TM on abstracts, we extended the methodology to full-text articles and developed a literature mining pipeline for PK parameter data extraction. It combines machine learning, automatic information processing, and manual curation. It comprises four main components: (1) information retrieval, which applies both ontology-based named entity recognition (NER) and ML methods to classify PubMed search results; (2) downloading of full PDF articles through PubMed external links; (3) information extraction of PK data from both the tables and the free text of articles; and (4) transformation and storage of the mined information, so that it can be accessed in a drug-modeling-friendly manner. This literature mining pipeline and methodology is the first working approach to extract numerical data from full-text articles, capable of processing both plain text and tabular data.
The specific contributions of this thesis include:

  • A new PK ontology for entity template construction
  • Comparison of NLP and machine learning algorithms for PK information retrieval
  • Tabular data extraction
  • PK information extraction from full-text literature
  • A complete pipeline for numerical data extraction from both abstracts and full-text articles for pharmacokinetics
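As a rough illustration of what extracting a numerical PK parameter from free text can involve, the sketch below uses a single regular expression to pair a parameter mention with a nearby value and unit. It is not the method of the thesis, which relies on an ontology, ML classifiers, and manual curation. The `PKValue` type, the `extract_pk_values` function, the keyword and unit lists, and the example sentence are all assumptions made up for this illustration.

```python
import re
from typing import List, NamedTuple


class PKValue(NamedTuple):
    """One extracted measurement (hypothetical record type)."""
    parameter: str
    value: float
    unit: str


# Assumed, deliberately tiny vocabulary: two parameter names and four units.
# A real system would draw these from a PK ontology instead.
PATTERN = re.compile(
    r"(?P<param>clearance|half-life)"          # parameter mention
    r"[^0-9.;]{0,40}?"                         # short gap: no digits, no sentence break
    r"(?P<value>\d+(?:\.\d+)?)"                # numeric value
    r"\s*(?P<unit>mL/min/kg|mL/min|L/h|h)\b",  # unit, longest alternatives first
    re.IGNORECASE,
)


def extract_pk_values(text: str) -> List[PKValue]:
    """Return every (parameter, value, unit) triple the pattern finds in text."""
    return [
        PKValue(m.group("param").lower(), float(m.group("value")), m.group("unit"))
        for m in PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    # Invented sentence for demonstration only; the numbers are not real data.
    sentence = (
        "The systemic clearance of midazolam was 6.6 mL/min/kg "
        "and the half-life was 2.5 h."
    )
    for item in extract_pk_values(sentence):
        print(item)
```

A pattern like this misses ranges, mean ± SD notation, and values reported in tables, which is part of why the pipeline described above combines several components rather than relying on pattern matching alone.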

Similar resources

Mining Generalized Association Rules on Biomedical Literature

The discovery of new and potentially meaningful relationships between concepts in the biomedical literature has attracted the attention of a lot of researchers in text mining. The main motivation is found in the increasing availability of the biomedical literature which makes it difficult for researchers in biomedicine to keep up with research progresses without the help of automatic knowledge ...

A Tree Kernel-Based Method for Protein-Protein Interaction Mining from Biomedical Literature

As genomic research advances, the knowledge discovery from a large collection of scientific papers becomes more important for efficient biological and biomedical research. Even though current databases continue to update new protein-protein interactions, valuable information still remains in biomedical literature. Thus data mining techniques are required to extract the information. In this pape...

Collection-Wide Extraction of Protein-Protein Interactions

Evidence in support of relationships among biomedical entities, such as protein-protein interactions, can be gathered from a multiplicity of sources. The larger the pool of evidence, the more likely a given interaction can be considered to be. In the context of biomedical text mining, this elementary observation can be translated into an approach that seeks to find in the literature all availab...

Concept Chain Graphs: A Hybrid IR Framework for Biomedical Text Mining

The area of biomedical text mining has seen much research activity due to the increased volume of literature that must be examined. Researchers need to validate and interpret their experimental results; this entails scouring through a massive amount of potentially relevant literature for clues that may shed light on their findings. An ideal situation would allow a user to interactively search t...

Integrating Biomedical Text Mining Services into a Distributed Workflow Environment

Workflows are useful ways to support scientific researchers in carrying out repetitive analytical tasks on digital information. Web services can provide a useful implementation mechanism for workflows, particularly when they are distributed, i.e., where some of the data or processing resources are remote from the scientist initiating the workflow. While many scientific workflows primarily invol...

Publication date: 2013